Why We Read Wikipedia
Wikipedia is one of the most popular sites on the Web, with millions of users
relying on it to satisfy a broad range of information needs every day. Although
it is crucial to understand what exactly these needs are in order to be able to
meet them, little is currently known about why users visit Wikipedia. The goal
of this paper is to fill this gap by combining a survey of Wikipedia readers
with a log-based analysis of user activity. Based on an initial series of user
surveys, we build a taxonomy of Wikipedia use cases along several dimensions,
capturing users' motivations to visit Wikipedia, the depth of knowledge they
are seeking, and their knowledge of the topic of interest prior to visiting
Wikipedia. Then, we quantify the prevalence of these use cases via a
large-scale user survey conducted on live Wikipedia with almost 30,000
responses. Our analyses highlight the variety of factors driving users to
Wikipedia, such as current events, media coverage of a topic, personal
curiosity, work or school assignments, or boredom. Finally, we match survey
responses to the respondents' digital traces in Wikipedia's server logs,
enabling the discovery of behavioral patterns associated with specific use
cases. For instance, we observe long and fast-paced page sequences across
topics for users who are bored or exploring randomly, whereas those using
Wikipedia for work or school spend more time on individual articles focused on
topics such as science. Our findings advance our understanding of reader
motivations and behavior on Wikipedia and can have implications for developers
aiming to improve Wikipedia's user experience, editors striving to cater to
their readers' needs, third-party services (such as search engines) providing
access to Wikipedia content, and researchers aiming to build tools such as
recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table
Detection of lipoarabinomannan (LAM) in urine is an independent predictor of mortality risk in patients receiving treatment for HIV-associated tuberculosis in sub-Saharan Africa: a systematic review and meta-analysis
Background: Simple immune capture assays that detect mycobacterial lipoarabinomannan (LAM) antigen in urine are promising new tools for the diagnosis of HIV-associated tuberculosis (HIV-TB). In addition, however, recent prospective cohort studies of patients with HIV-TB have demonstrated associations between LAM in the urine and increased mortality risk during TB treatment, indicating an additional utility of urinary LAM as a prognostic marker. We conducted a systematic review and meta-analysis to summarise the evidence concerning the strength of this relationship in adults with HIV-TB in sub-Saharan Africa, thereby quantifying the assay’s prognostic value.
Methods: We searched MEDLINE and Embase databases using comprehensive search terms for ‘HIV’, ‘TB’, ‘LAM’ and ‘sub-Saharan Africa’. Identified studies were reviewed and selected according to predefined criteria.
Results: We identified 10 studies eligible for inclusion in this systematic review, reporting on a total of 1172 HIV-TB cases. Of these, 512 patients (44%) tested positive for urinary LAM. After a variable duration of follow-up of between 2 and 6 months, overall case fatality rates among HIV-TB cases varied between 7% and 53%. Pooled summary estimates generated by random-effects meta-analysis showed a two-fold increased risk of mortality for urinary LAM-positive HIV-TB cases compared to urinary LAM-negative HIV-TB cases (relative risk 2.3, 95% confidence interval 1.6–3.1). Some heterogeneity was explained by study setting and patient population in sub-group analyses. Five studies also reported multivariable analyses of risk factors for mortality, and pooled summary estimates demonstrated over two-fold increased mortality risk (odds ratio 2.5, 95% confidence interval 1.4–4.5) among urinary LAM-positive HIV-TB cases, even after adjustment for other risk factors for mortality, including CD4 cell count.
Conclusions: We have demonstrated that detectable LAM in urine is associated with increased risk of mortality during TB treatment, and that this relationship remains after adjusting for other risk factors for mortality. This may simply be due to a positive test for urinary LAM serving as a marker of higher mycobacterial load and greater disease dissemination and severity. Alternatively, LAM antigen may directly compromise host immune responses through its known immunomodulatory effects. Detectable LAM in the urine is an independent risk factor for mortality among patients receiving treatment for HIV-TB. Further research is warranted to elucidate the underlying mechanisms and to determine whether this vulnerable patient population may benefit from adjunctive interventions.
Electronic supplementary material: The online version of this article (doi:10.1186/s12916-016-0603-9) contains supplementary material, which is available to authorized users
Asynchronous Training of Word Embeddings for Large Text Corpora
Word embeddings are a powerful approach for analyzing language and have been
widely popular in numerous tasks in information retrieval and text mining.
Training embeddings over huge corpora is computationally expensive because the
input is typically sequentially processed and parameters are synchronously
updated. Distributed architectures for asynchronous training that have been
proposed either focus on scaling vocabulary sizes and dimensionality or suffer
from expensive synchronization latencies.
In this paper, we propose a scalable approach to train word embeddings by
partitioning the input space instead in order to scale to massive text corpora
while not sacrificing the performance of the embeddings. Our training procedure
does not involve any parameter synchronization except a final sub-model merge
phase that typically executes in a few minutes. Our distributed training scales
seamlessly to large corpus sizes, and we obtain comparable performance, and sometimes improvements of up to
45%, on a variety of NLP benchmarks using models trained
by our distributed procedure which requires of the time taken by the
baseline approach. Finally, we also show that we are robust to missing words in
sub-models and are able to effectively reconstruct word representations. Comment: This paper contains 9 pages and has been accepted in the WSDM201
Wavelet Based Fractal Analysis of Airborne Pollen
The most abundant biological particles in the atmosphere are pollen grains
and spores. People with pollen allergies can protect themselves when given
information about future pollen levels in the air. Despite the importance of
airborne pollen concentration forecasting, it has not been possible to predict
pollen concentrations with great accuracy, and about 25% of daily
pollen forecasts have failed. Previous analyses of the dynamic
characteristics of atmospheric pollen time series indicate that the system can
be described by a low-dimensional chaotic map. We apply the wavelet transform
to study the multifractal characteristics of an airborne pollen time series.
We find persistence behaviour associated with low pollen concentration values
and with the rarest events of highest pollen concentration values. The
information and correlation dimensions correspond to a chaotic system
showing loss of information with time evolution. Comment: 11 pages, 7 figure
A survey of location inference techniques on Twitter
The increasing popularity of the social networking service, Twitter, has made it more involved in day-to-day communications, strengthening social relationships and information dissemination. Conversations on Twitter are now being explored as indicators within early warning systems to alert of imminent natural disasters such as earthquakes and aid prompt emergency responses to crime. Producers are privileged to have limitless access to market perception from consumer comments on social media and microblogs. Targeted advertising can be made more effective based on user profile information such as demography, interests and location. While these applications have proven beneficial, the ability to effectively infer the location of Twitter users has even more immense value. However, accurately identifying where a message originated from or an author’s location remains a challenge, thus essentially driving research in that regard. In this paper, we survey a range of techniques applied to infer the location of Twitter users from inception to state of the art. We find significant improvements over time in the granularity levels and better accuracy with results driven by refinements to algorithms and inclusion of more spatial features
Domain-independent Extraction of Scientific Concepts from Research Articles
We examine the novel task of domain-independent scientific concept extraction
from abstracts of scholarly articles and present two contributions. First, we
suggest a set of generic scientific concepts that have been identified in a
systematic annotation process. This set of concepts is utilised to annotate a
corpus of scientific abstracts from 10 domains of Science, Technology and
Medicine at the phrasal level in a joint effort with domain experts. The
resulting dataset is used in a set of benchmark experiments to (a) provide
baseline performance for this task, (b) examine the transferability of concepts
between domains. Second, we present two deep learning systems as baselines. In
particular, we propose active learning to deal with different domains in our
task. The experimental results show that (1) a substantial agreement is
achievable by non-experts after consultation with domain experts, (2) the
baseline system achieves a fairly high F1 score, (3) active learning enables us
to nearly halve the amount of required training data. Comment: Accepted for publishing in 42nd European Conference on IR Research,
ECIR 202
Mercury's Moment of Inertia from Spin and Gravity Data
Earth-based radar observations of the spin state of Mercury at 35 epochs between 2002 and 2012 reveal that its spin axis is tilted by (2.04 ± 0.08) arc min with respect to the orbit normal. The direction of the tilt suggests that Mercury is in or near a Cassini state. Observed rotation rate variations clearly exhibit an 88-day libration pattern which is due to solar gravitational torques acting on the asymmetrically shaped planet. The amplitude of the forced libration, (38.5 ± 1.6) arc sec, corresponds to a longitudinal displacement of ∼450 m at the equator. Combining these measurements of the spin properties with second-degree gravitational harmonics (Smith et al., 2012) provides an estimate of the polar moment of inertia of Mercury, C/MR² = 0.346 ± 0.014, where M and R are Mercury's mass and radius. The fraction of the moment that corresponds to the outer librating shell, which can be used to estimate the size of the core, is Cm/C = 0.431 ± 0.025
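The link between the libration amplitude and the equatorial displacement quoted in this abstract is a small-angle arc length, s = Rθ. The following is a minimal sketch of that arithmetic. The radius value (2439.7 km) and the function name are assumptions introduced here for illustration; neither is given in the abstract.

```python
import math

# Conversion factor from arc seconds to radians.
ARCSEC_TO_RAD = math.pi / (180 * 3600)

# Assumed mean radius of Mercury in metres (not stated in the abstract).
MERCURY_RADIUS_M = 2_439_700


def libration_displacement_m(amplitude_arcsec, radius_m=MERCURY_RADIUS_M):
    """Arc length at the equator for a given libration amplitude (s = R * theta)."""
    return amplitude_arcsec * ARCSEC_TO_RAD * radius_m


# Forced libration amplitude reported in the abstract: 38.5 arc sec.
displacement = libration_displacement_m(38.5)
print(f"equatorial displacement: {displacement:.0f} m")
```

Under the assumed radius the result lands within a few metres of the ∼450 m quoted above, which suggests the abstract's figure is a rounded arc length.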
Rapid urine-based screening for tuberculosis in HIV-positive patients admitted to hospital in Africa (STAMP): a pragmatic, multicentre, parallel-group, double-blind, randomised controlled trial.
BACKGROUND
Current diagnostics for HIV-associated tuberculosis are suboptimal, with missed diagnoses contributing to high hospital mortality and approximately 374 000 annual HIV-positive deaths globally. Urine-based assays have a good diagnostic yield; therefore, we aimed to assess whether urine-based screening in HIV-positive inpatients for tuberculosis improved outcomes.
METHODS
We did a pragmatic, multicentre, double-blind, randomised controlled trial in two hospitals in Malawi and South Africa. We included HIV-positive medical inpatients aged 18 years or more who were not taking tuberculosis treatment. We randomly assigned patients (1:1), using a computer-generated list of random block size stratified by site, to either the standard-of-care or the intervention screening group, irrespective of symptoms or clinical presentation. Attending clinicians made decisions about care; and patients, clinicians, and the study team were masked to the group allocation. In both groups, sputum was tested using the Xpert MTB/RIF assay (Xpert; Cepheid, Sunnyvale, CA, USA). In the standard-of-care group, urine samples were not tested for tuberculosis. In the intervention group, urine was tested with the Alere Determine TB-LAM Ag (TB-LAM; Alere, Waltham, MA, USA), and Xpert assays. The primary outcome was all-cause 56-day mortality. Subgroup analyses for the primary outcome were prespecified based on baseline CD4 count, haemoglobin, clinical suspicion for tuberculosis; and by study site and calendar time. We used an intention-to-treat principle for our analyses. This trial is registered with the ISRCTN registry, number ISRCTN71603869.
FINDINGS
Between Oct 26, 2015, and Sept 19, 2017, we screened 4788 HIV-positive adults, of whom 2600 (54%) were randomly assigned to the study groups (n=1300 for each group). 13 patients were excluded after randomisation from analysis in each group, leaving 2574 in the final intention-to-treat analysis (n=1287 in each group). At admission, 1861 patients were taking antiretroviral therapy and median CD4 count was 227 cells per μL (IQR 79-436). Mortality at 56 days was reported for 272 (21%) of 1287 patients in the standard-of-care group and 235 (18%) of 1287 in the intervention group (adjusted risk reduction [aRD] -2·8%, 95% CI -5·8 to 0·3; p=0·074). In three of the 12 prespecified, but underpowered subgroups, mortality was lower in the intervention group than in the standard-of-care group for CD4 counts less than 100 cells per μL (aRD -7·1%, 95% CI -13·7 to -0·4; p=0·036), severe anaemia (-9·0%, -16·6 to -1·3; p=0·021), and patients with clinically suspected tuberculosis (-5·7%, -10·9 to -0·5; p=0·033); with no difference by site or calendar period. Adverse events were similar in both groups.
INTERPRETATION
Urine-based tuberculosis screening did not reduce overall mortality in all HIV-positive inpatients, but might benefit some high-risk subgroups. Implementation could contribute towards global targets to reduce tuberculosis mortality.
FUNDING
Joint Global Health Trials Scheme of the Medical Research Council, the UK Department for International Development, and the Wellcome Trust
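As a rough check on the STAMP mortality figures reported above, the unadjusted risk difference can be recomputed from the raw counts (272 of 1287 versus 235 of 1287). The sketch below uses a simple Wald interval, which is an assumption made here: the abstract reports an adjusted estimate and does not describe its model, so the two need not match exactly. The function name is hypothetical.

```python
import math


def risk_difference(events_control, n_control, events_treated, n_treated, z=1.96):
    """Unadjusted risk difference (treated minus control) with a Wald 95% CI."""
    p_control = events_control / n_control
    p_treated = events_treated / n_treated
    diff = p_treated - p_control
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(
        p_control * (1 - p_control) / n_control
        + p_treated * (1 - p_treated) / n_treated
    )
    return diff, diff - z * se, diff + z * se


# 56-day deaths: standard-of-care 272/1287, intervention 235/1287.
rd, ci_low, ci_high = risk_difference(272, 1287, 235, 1287)
print(f"unadjusted RD = {rd:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")
```

The unadjusted interval also spans zero, in line with the abstract's conclusion that overall mortality was not clearly reduced.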
Electrons in High-Tc Compounds: Ab-Initio Correlation Results
Electronic correlations in the ground state of an idealized infinite-layer
high-Tc compound are computed using the ab-initio method of local ansatz.
Comparisons are made with the local-density approximation (LDA) results, and
the correlation functions are analyzed in detail. These correlation functions
are used to determine the effective atomic-interaction parameters for model
Hamiltonians. On the resulting model, doping dependencies of the relevant
correlations are investigated. Aside from the expected strong atomic
correlations, particular spin correlations arise. The dominating contribution
is a strong nearest neighbor correlation that is Stoner-enhanced due to the
closeness of the ground state to the magnetic phase. This feature depends
moderately on doping, and is absent in a single-band Hubbard model. Our
calculated spin correlation function is in good qualitative agreement with that
determined from the neutron scattering experiments for a metal. Comment: 21pp, 5fig, Phys. Rev. B (Oct. 98
Communication calls produced by electrical stimulation of four structures in the guinea pig brain
One of the main central processes affecting the cortical representation of conspecific vocalizations is the collateral output from the extended motor system for call generation. Before starting to study this interaction we sought to compare the characteristics of calls produced by stimulating four different parts of the brain in guinea pigs (Cavia porcellus). By using anaesthetised animals we were able to reposition electrodes without distressing the animals. Trains of 100 electrical pulses were used to stimulate the midbrain periaqueductal grey (PAG), hypothalamus, amygdala, and anterior cingulate cortex (ACC). Each structure produced a similar range of calls, but in significantly different proportions. Two of the spontaneous calls (chirrup and purr) were never produced by electrical stimulation and although we identified versions of chutter, durr and tooth chatter, they differed significantly from our natural call templates. However, we were routinely able to elicit seven other identifiable calls. All seven calls were produced both during the 1.6 s period of stimulation and subsequently in a period which could last for more than a minute. A single stimulation site could produce four or five different calls, but the amygdala was much less likely to produce a scream, whistle or rising whistle than any of the other structures. These three high-frequency calls were more likely to be produced by females than males. There were also differences in the timing of the call production with the amygdala primarily producing calls during the electrical stimulation and the hypothalamus mainly producing calls after the electrical stimulation. For all four structures a significantly higher stimulation current was required in males than females. We conclude that all four structures can be stimulated to produce fictive vocalizations that should be useful in studying the relationship between the vocal motor system and cortical sensory representation